On-chip networks for manycore architecture
نویسنده
چکیده
Over the past decade, increasing the number of cores on a single processor has successfully enabled continued improvements of computer performance. Further scaling these designs to tens and hundreds of cores, however, still presents a number of hard problems, such as scalability, power efficiency and effective programming models. A key component of manycore systems is the on-chip network, which faces increasing efficiency demands as the number of cores grows. In this thesis, we present three techniques for improving the efficiency of on-chip interconnects. First, we present PROM (Path-based, Randomized, Oblivious, and Minimal routing) and BAN (Bandwidth Adaptive Networks), techniques that offer efficient intercore communication for bandwith-constrained networks. Next, we present ENC (Exclusive Native Context), the first deadlock-free, fine-grained thread migration protocol developed for on-chip networks. ENC demonstrates that a simple and elegant technique in the on-chip network can provide critical functional support for higher-level application and system layers. Finally, we provide a realistic context by sharing our hands-on experience in the physical implementation of the on-chip network for the Execution Migration Machine, an ENC-based 110-core processor fabricated in 45nm ASIC technology. Thesis Supervisor: Srinivas Devadas Title: Edwin Sibley Webster Professor
منابع مشابه
Cost-aware Topology Customization of Mesh-based Networks-on-Chip
Nowadays, the growing demand for supporting multiple applications causes to use multiple IPs onto the chip. In fact, finding truly scalable communication architecture will be a critical concern. To this end, the Networks-on-Chip (NoC) paradigm has emerged as a promising solution to on-chip communication challenges within the silicon-based electronics. Many of today’s NoC architectures are based...
متن کاملPerformance Guarantees in Manycore Embedded Systems based on Network-on-Chip
Embedded applications have strict performance guarantees that must be met in every scenario. Such guarantees have to be verified – formally or empirically – at design time, and the verification process usually guides design space exploration towards configurations that can meet those guarantees at the lowest cost and lowest energy dissipation. Manycore systems pose new challenges to this proces...
متن کاملA Distributed Run-Time Environment for the Kalray MPPA®-256 Integrated Manycore Processor
The Kalray MPPA ©-256 is a single-chip manycore processor that integrates 256 user cores and 32 system cores in 28nm CMOS technology. These cores are distributed across 16 compute clusters of 16+1 cores, and 4 quad-core I/O subsystems. Each compute cluster and I/O subsystem owns a private address space, while communication and synchronization between them is ensured by data and control Networks...
متن کاملEfficient Cache Coherence on Manycore Optical Networks
Ever since industry has turned to parallelism instead of frequency scaling to improve processor performance, multicore processors have continued to scale to larger and larger numbers of cores. Some believe that multicores will have 1000 cores or more by the middle of the next decade. However, their promise of increased performance will only be reached if their inherent scaling challenges are ov...
متن کاملEfficient On-Chip Pipelined Streaming Computations on Scalable Manycore Architectures
Performance of manycore processors is limited by programs’ use of off-chip main memory. Streaming computation organized in a pipeline limits accesses to main memory to tasks at boundaries of the pipeline to read or write to main memory. The Single Chip Cloud computer (SCC) offers 48 cores linked by a highspeed on-chip network, and allows the implementation of such on-chip pipelined technique. W...
متن کامل